Skip to content

[SPARK-57778][SQL] Handle Oracle JDBC objects that are not selectable as tables#56898

Open
ivan-leventsov wants to merge 2 commits into
apache:masterfrom
ivan-leventsov:oracle-skip-non-selectable-objects
Open

[SPARK-57778][SQL] Handle Oracle JDBC objects that are not selectable as tables#56898
ivan-leventsov wants to merge 2 commits into
apache:masterfrom
ivan-leventsov:oracle-skip-non-selectable-objects

Conversation

@ivan-leventsov

Copy link
Copy Markdown

What changes were proposed in this pull request?

When using the built-in JDBC TableCatalog over Oracle, the driver's table listing returns objects that cannot be read as tables, for example a synonym that resolves to a PL/SQL procedure/function/package, or an invalid view. Probing such an object (tableExists via SELECT 1 FROM <obj> WHERE 1=0, or schema resolution via SELECT * FROM <obj> WHERE 1=0) raises ORA-04044 ("procedure, function, package, or type is not allowed here") or ORA-04063 ("... has errors"). These are currently unclassified, so tableExists throws instead of returning false, and table resolution surfaces a raw FAILED_JDBC error.

This PR adds a new JdbcDialect.isNotSelectableObjectException predicate (default false; OracleDialect recognizes ORA-04044/ORA-04063), kept separate from isObjectNotFoundException ("object does not exist"). With it:

  • JdbcUtils.tableExists returns false for such objects instead of throwing;
  • JDBCRDD.resolveTable throws a dedicated, clear error condition JDBC_OBJECT_NOT_SELECTABLE instead of a generic failure.

Why are the changes needed?

A schema legitimately contains synonyms/views that are not selectable as tables. Probing them should not surface a raw external-engine error or make existence checks throw; such objects should be reported as not a readable table.

Does this PR introduce any user-facing change?

Yes. For a JDBC catalog over Oracle:

  • reading an object that is not selectable as a table now fails with the dedicated error condition JDBC_OBJECT_NOT_SELECTABLE (SQLSTATE 42000) instead of a raw FAILED_JDBC error;
  • tableExists returns false for such an object instead of throwing.

How was this patch tested?

New cases in org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite (docker-integration-tests), covering a synonym to a procedure, a synonym to a broken function (both ORA-04044), and an invalid view (ORA-04063). Each asserts that tableExists returns false and that reading the object fails with JDBC_OBJECT_NOT_SELECTABLE.

Was this patch authored or co-authored using generative AI tooling?

Yes, co-authored using Cursor (AI assistant).

@ivan-leventsov ivan-leventsov force-pushed the oracle-skip-non-selectable-objects branch from 30f2dac to 54139ed Compare June 30, 2026 09:45

@MaxGekk MaxGekk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 blocking, 1 non-blocking, 0 nits.
Clean Oracle error-classification extension that faithfully mirrors the existing isObjectNotFoundException predicate pattern (dialect default + Oracle override + tableExists/resolveTable wiring), kept deliberately separate so "not found" and "not selectable" stay distinct conditions.

Suggestions (1)

  • JdbcDialects.scala:811: @Since("4.2.0") looks stale — current branch-4.x is 4.3.0, so a method added now first ships there — see inline.

Verification

  • Predicate disjointness (the key correctness point): OracleDialect overrides isObjectNotFoundException to match only ORA-00942/ORA-39165, so it does NOT intercept ORA-04044/ORA-04063. The new isNotSelectableObjectException branch in resolveTable is therefore reachable (not dead), and it precedes the isSyntaxErrorBestEffort (42000) case so a 42000 SQLState can't divert it. tableExists returns false for these objects.
  • Default false leaves non-Oracle dialects unaffected; the error condition is alphabetically placed with a matching <objectName> param and chained cause. Integration tests cover ORA-04044 (procedure + broken-function synonyms) and ORA-04063 (invalid view), asserting both tableExists=false and the dedicated error.

Option(e.getSQLState).exists(_.startsWith("42"))
}

@Since("4.2.0")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dev/next_version_candidates.py reports the current branch-4.x at 4.3.0 (master 5.0.0), so a method added now first ships in 4.3.04.2.0 looks like it predates the recent version bump (the existing @Since("4.2.0") entries in this file are from the prior cycle).

Suggested change
@Since("4.2.0")
@Since("4.3.0")

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

… as tables

### What changes were proposed in this pull request?

When using the built-in JDBC `TableCatalog` over Oracle, the driver's table listing
returns objects that cannot be read as tables, e.g. a synonym that resolves to a PL/SQL
procedure/function/package, or an invalid view. Probing such an object (`tableExists` via
`SELECT 1 FROM <obj> WHERE 1=0`, or schema resolution via `SELECT * FROM <obj> WHERE 1=0`)
raises `ORA-04044` ("procedure, function, package, or type is not allowed here") or
`ORA-04063` ("... has errors"). These were unclassified, so `tableExists` threw and table
resolution surfaced a raw `FAILED_JDBC` failure.

This adds a `JdbcDialect.isNotSelectableObjectException` predicate (Oracle recognizes
`ORA-04044`/`ORA-04063`) so:
- `JdbcUtils.tableExists` returns `false` for such objects instead of throwing;
- `JDBCRDD.resolveTable` throws a dedicated, clear error (`JDBC_OBJECT_NOT_SELECTABLE`)
  instead of a generic failure.

### Why are the changes needed?

A schema legitimately contains synonyms/views that are not selectable tables; probing them
should not surface a raw external-engine error or make existence checks throw.

### Does this PR introduce any user-facing change?

Yes. Querying such an object now returns `JDBC_OBJECT_NOT_SELECTABLE` instead of a raw
`FAILED_JDBC` error, and `tableExists` returns `false` for it.

### How was this patch tested?

New cases in `OracleIntegrationSuite` (docker-integration-tests): synonym to a procedure,
synonym to a broken function, and an invalid view.
@ivan-leventsov ivan-leventsov force-pushed the oracle-skip-non-selectable-objects branch from 54139ed to be95c45 Compare June 30, 2026 10:19
@ivan-leventsov ivan-leventsov requested a review from MaxGekk June 30, 2026 10:20

@cloud-fan cloud-fan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 blocking, 2 non-blocking.
Clean change that mirrors the existing isObjectNotFoundException classification pattern; the design (new predicate + dedicated JDBC_OBJECT_NOT_SELECTABLE condition) is the right call, and the new error propagates through loadTable's FAILED_JDBC.LOAD_TABLE wrapper without being masked. One blocking version-stamp fix.

Correctness (1)

  • JdbcDialects.scala:811: @Since("4.2.0") should be @Since("4.3.0") — see inline

Suggestions (2)

  • JdbcDialects.scala:812: add Scaladoc documenting the new opt-in hook — see inline
  • OracleDialect.scala:59: optional getErrorCode vs message-substring note — see inline

Verification

Checked that JDBC_OBJECT_NOT_SELECTABLE is the correct outcome rather than reusing TABLE_OR_VIEW_NOT_FOUND: the object exists (synonym→procedure for ORA-04044, INVALID view for ORA-04063), so the not-found message ("cannot be found, verify the spelling") would mislead, and the NoSuchTableException type it carries is silently tolerated by IF-EXISTS logic. Also traced propagation: JdbcUtils.classifyException rethrows SparkThrowable unchanged (JdbcUtils.scala:1382), so the new AnalysisException survives loadTable's wrapper. Branch ordering in resolveTable is correct — the new predicate is checked before isSyntaxErrorBestEffort, which matches Oracle SQLSTATE "42000" that ORA-04044/04063 can also carry.

}

@Since("4.3.0")
def isNotSelectableObjectException(e: SQLException): Boolean = false

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a general opt-in hook (base default false, like isObjectNotFoundException), but only Oracle overrides it today. Consider a short Scaladoc stating the contract, so the Oracle-only impl reads as the first adopter and the next dialect author knows when to override. Placed above the @Since line, e.g.:

  /**
   * Returns true if the given exception indicates the object exists but cannot be read as a
   * table or view (e.g. a synonym that resolves to a procedure, or an invalid view), as opposed
   * to not existing at all (see `isObjectNotFoundException`). Dialects override this to recognize
   * their own error codes; the default is false.
   */

(Left as an illustrative block rather than a one-click suggestion so it doesn't collide with the @Since fix above.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, fixed


override def isNotSelectableObjectException(e: SQLException): Boolean = {
// ORA-04044: object is not a table (e.g. a synonym to a procedure/function/package).
e.getMessage.contains("ORA-04044") ||

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, non-blocking: matching on e.getMessage.contains(...) is fragile to message format/locale; e.getErrorCode == 4044 (and 4063) would be more robust. This mirrors the existing isObjectNotFoundException here, so it's consistent either way — only worth changing if you want to harden both together.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, makes sense, but I'd leave it as is in this PR and keep it consistent with the existing implementation to minimize risk and avoid potential side effects from this small change (for example, the behavior might differ across driver versions etc..)

@ivan-leventsov ivan-leventsov requested a review from cloud-fan June 30, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants